We report a measurement of the single top quark production cross section in 2.2 fb$^{-1}$ of $p\bar{p}$ collision data collected by the Collider Detector at Fermilab at $\sqrt{s} = 1.96$ TeV. Candidate events are classified as signal-like by three parallel analyses which use likelihood, matrix element, and neural network discriminants. These results are combined in order to improve the sensitivity. We observe a signal consistent with the standard model prediction, but inconsistent with the background-only model by 3.7 standard deviations with a median expected sensitivity of 4.9 standard deviations. We measure a cross section of $2.2^{+0.7}_{-0.6}$(stat + sys) pb, extract the CKM matrix element value $|V_{tb}| = 0.88^{+0.13}_{-0.12}$(stat + sys) ± 0.07(theory), and set the limit $|V_{tb}| > 0.66$ at the 95% C.L.
The top quark was discovered by the CDF and DØ collaborations in 1995 [1] in the strong interaction $p\bar{p} \to t\bar{t} + X$. Since then, a comprehensive program of measurements has brought more precise knowledge of the top quark's mass, pair-production cross section, and a number of its decay properties [2]. The evidence strongly suggests that the particle observed in 1995 is the SU(2) partner of the bottom quark and that it decays nearly 100% of the time into $Wb$ with a very short lifetime. The weak couplings of the top quark are less well constrained, except that $|V_{tb}|^2 \gg |V_{td}|^2 + |V_{ts}|^2$ [2]. Requiring that the 3×3 Cabibbo-Kobayashi-Maskawa (CKM) matrix is unitary implies that $|V_{tb}| \approx 1$ [2]. With a matrix of higher rank, though, $|V_{tb}|$ could be small without measurably changing the $t \to Wb$ branching ratio. Production of single top quarks provides a direct measurement of $|V_{tb}|$ and a test of the $b$-quark content of the proton.

Top quarks are expected to be produced singly, as shown in Fig. 1. The combined $s$- plus $t$-channel cross section is predicted at next-to-leading order (NLO) to be $\sigma_{st} = 2.86 \pm 0.36$ pb [3, 4]. The small signal cross section and the presence of only one top quark in the final state make the separation of the signal from the large background challenging. Since the signal has very similar final states to the standard model Higgs boson production process $WH \to \ell\nu b\bar{b}$, the methods of this analysis can be used to search for the Higgs boson.

Recently, the DØ collaboration has reported evidence for single top quark production using 0.9 fb$^{-1}$ of data [5] while measuring a cross section of $\sigma_{st} = 4.7 \pm 1.3$ pb. This Letter reports a significantly more precise measurement of $\sigma_{st}$ in 2.2 fb$^{-1}$ of $p\bar{p}$ collisions at $\sqrt{s} = 1.96$ TeV using the CDF II detector.

The CDF II detector [6] is a general purpose apparatus 
located at the Tevatron collider at Fermilab. The detec- 
tor consists of a solenoidal charged particle spectrometer 
which includes a silicon microstrip detector array sur- 
rounded by a cylindrical drift chamber in a 1.4 T axial 
magnetic field. The energies of electrons and jets are 
measured with segmented sampling calorimeters. Surrounding
the calorimeters are layers of steel instrumented
with planar drift chambers and scintillators used for 
muon identification. 

Three distinct trigger algorithms are employed to select the data used in this analysis: a high-$p_T$ electron trigger, a high-$p_T$ muon trigger, and a trigger that requires large missing transverse energy with either an energetic electromagnetic cluster or two separated jets [7, 8].

FIG. 1: Representative Feynman diagrams of single top quark production. Figures (a) and (b) are $t$-channel processes, and figure (c) is the $s$-channel process.

Events are further selected by requiring the presence of an isolated electron or muon candidate with $p_T > 20$ GeV/$c$, large missing transverse energy $\not{E}_T > 25$ GeV, and either two or three jets each with $E_T > 20$ GeV and $|\eta| < 2.8$. The jets are identified by a fixed-cone algorithm with radius $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} = 0.4$, and their energies are corrected for instrumental effects [9]. At least one of the jets is required to have a displaced vertex ($b$ tag) as identified by the SECVTX algorithm [10]. This $b$ tag preferentially selects jets containing $B$ hadrons.
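
The selection requirements above can be summarized in a short sketch. The `Jet` and `Event` containers and their field names are hypothetical, introduced only for illustration; they do not correspond to any CDF software interface. Only the thresholds are taken from the text, and counting the jets that pass the $E_T$ and $|\eta|$ requirements is one possible reading of the jet-multiplicity cut.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Jet:
    et: float        # corrected transverse energy in GeV
    eta: float       # pseudorapidity
    b_tagged: bool   # True if a displaced vertex was found in the jet


@dataclass
class Event:
    lepton_pt: float    # pT of the isolated electron or muon in GeV/c
    missing_et: float   # missing transverse energy in GeV
    jets: List[Jet] = field(default_factory=list)


def passes_selection(event: Event) -> bool:
    """Apply the lepton, missing-ET, jet-multiplicity and b-tag requirements."""
    # Keep only jets inside the kinematic acceptance quoted in the text.
    good_jets = [j for j in event.jets if j.et > 20.0 and abs(j.eta) < 2.8]
    return (
        event.lepton_pt > 20.0
        and event.missing_et > 25.0
        and len(good_jets) in (2, 3)
        and any(j.b_tagged for j in good_jets)
    )
```

The vetoes described next (second lepton, cosmic rays, photon conversions, non-$W$ rejection) are deliberately left out of this sketch.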

In order to reduce the $Z$+jets, $t\bar{t}$, and diboson backgrounds, candidate events with a second charged lepton are rejected. Cosmic ray and photon candidates are identified and removed. Multijet background events without a leptonic $W$ decay ("non-$W$") are rejected with specific selection requirements [11, 12].

The diboson ($WW$, $WZ$, $ZZ$) and $t\bar{t}$ event yields are predicted using PYTHIA [13] Monte Carlo (MC) samples normalized to the theoretical cross section [14]. Processes in which a vector boson is produced in association with one or more jets ($Z$ or $W$+jets) are generated with ALPGEN [15] using PYTHIA's parton showering and underlying event model. The $W$+jets samples are normalized to the measured data using events with exactly one, two, or three jets. A normalization factor of 1.4 ± 0.4 is applied to ALPGEN's prediction for the fraction of $Wb\bar{b}$ and $Wc\bar{c}$ events. This factor is estimated by comparing the flavor content in $b$-tagged $W$+1 jet events in the data to the prediction from simulation. The background from events with mistakenly $b$-tagged light-flavor jets ($Wjj$) is estimated by measuring the rate of such mistags in multijet data [16]. The mistag rate is then applied to the $W$+jets data samples before $b$ tagging. The contributions to the data samples from non-$Wjj$ sources are subtracted from the prediction [17]. Multijet non-$W$ events typically have less $\not{E}_T$ than events containing $W$ bosons. By using templates for non-$W$ and $W$+jets, we fit the $\not{E}_T$ distribution and extract the non-$W$ fraction in the high-$\not{E}_T$ signal region. The kinematic properties of the non-$W$ events are modeled using data events and $W$+jets is modeled using MC simulated events. The observed event yields and corresponding predictions are given in Table I.



TABLE I: Background composition and predicted number of single top events in 2.2 fb$^{-1}$ of CDF Run II data with at least one $b$-tagged jet.

Process                    W + 2 jets         W + 3 jets
s-channel signal           40.3 ± 5.8         13.1 ± 1.9
t-channel signal           60.8 ± 8.9         17.9 ± 2.6
$Wb\bar{b}$                451.1 ± 136.0      138.0 ± 41.7
$Wc\bar{c}$ + $Wcj$        372.5 ± 114.8      103.2 ± 31.8
$Wjj$                      337.1 ± 41.9       101.6 ± 12.8
$t\bar{t}$                 142.0 ± 20.3       327.8 ± 46.6
Non-$W$                    60.5 ± 24.2        21.0 ± 8.4
Diboson                    61.1 ± 6.2         20.4 ± 2.1
$Z$+jets                   25.5 ± 3.8         10.5 ± 1.5
Total prediction           1550.9 ± 256.6     753.5 ± 87.6
Observed                   1546               719





Single top events are simulated using the tree-level matrix-element generator MADEVENT [18]. The two $t$-channel processes of Figs. 1(a) and 1(b) are combined to match the event kinematics as predicted by a fully differential NLO calculation [3, 19].

The expected standard model signal-to-background ra- 
tio for selected events is ~7% in the two-jet sample and 
~5% in the three-jet sample. The uncertainties on the 
background predictions are larger than the expected sig- 
nals; therefore, we have developed three powerful dis- 
criminants to distinguish signal from background events. 
The predicted distributions of each discriminant are fit 
to the data to extract the single top production cross sec- 
tion. All analyses use the same event selection and were 
optimized with the signal region blinded. 
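
As a quick arithmetic check, the signal-to-background ratios can be recomputed from the central values in Table I. This is a minimal sketch: the dictionary keys are informal labels chosen here, and the uncertainties in the table are ignored.

```python
# Central values of the predicted yields from Table I: (two-jet, three-jet).
SIGNAL = {
    "s-channel": (40.3, 13.1),
    "t-channel": (60.8, 17.9),
}
BACKGROUND = {
    "Wbb": (451.1, 138.0),
    "Wcc+Wcj": (372.5, 103.2),
    "Wjj": (337.1, 101.6),
    "ttbar": (142.0, 327.8),
    "non-W": (60.5, 21.0),
    "diboson": (61.1, 20.4),
    "Z+jets": (25.5, 10.5),
}


def signal_to_background(column: int) -> float:
    """Return S/B for column 0 (two-jet) or column 1 (three-jet)."""
    signal = sum(yields[column] for yields in SIGNAL.values())
    background = sum(yields[column] for yields in BACKGROUND.values())
    return signal / background


for name, column in (("two-jet", 0), ("three-jet", 1)):
    print(f"{name}: S/B = {signal_to_background(column):.1%}")
```

With these central values the ratios come out at roughly 7% and 4%, close to the rounded figures quoted above and well below the relative uncertainty on the background prediction.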

$Wjj$, $Wc\bar{c}$ and $Wcj$ events do not contain $b$-quark jets, but constitute ~40% of the estimated background after imposing a $b$ tag requirement. As part of all three discriminants we employ a jet-flavor separating variable, $b_{NN}$, constructed using the neural network tool NEUROBAYES [20], which is trained to distinguish $b$ jets from charm and light-flavor jets based on secondary vertex tracking information [11]. The usage of $b_{NN}$ leads to an improvement in sensitivity of 15 to 20% in each analysis.

Likelihood Function Discriminant (LF): A projective likelihood technique [11, 21] is used to combine information from several input variables to optimize the separation of the single top signal from the backgrounds. Two likelihood functions are created, one for two-jet events, $\mathcal{L}_{2j}$, and one for three-jet events, $\mathcal{L}_{3j}$. The input variables used for $\mathcal{L}_{2j}$ are $b_{NN}$, $Q \times \eta$ [22], the invariant mass of the $\ell\nu b$ system $M_{\ell\nu b}$, the total scalar sum of transverse energy in the event $H_T$, $\cos\theta_{\ell j}$ [23], the dijet mass $M_{jj}$, and the $t$-channel matrix element.
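
One common form of a projective likelihood discriminant, built from one-dimensional binned templates, is sketched below. It is an illustration of the general technique only: the exact construction used in the LF analysis is given in the cited references and may differ, and the function and argument names are hypothetical.

```python
from typing import Sequence


def projective_likelihood(
    event_bins: Sequence[int],
    signal_templates: Sequence[Sequence[float]],
    background_templates: Sequence[Sequence[float]],
) -> float:
    """Combine per-variable template fractions into one discriminant in [0, 1].

    event_bins[i] is the bin index of input variable i for this event.
    signal_templates[i][j] and background_templates[i][j] are the normalized
    fractions of signal and background events in bin j of variable i.
    """
    p_signal = 1.0
    p_background = 1.0
    for i, j in enumerate(event_bins):
        s = signal_templates[i][j]
        b = background_templates[i][j]
        if s + b == 0.0:
            continue  # empty bin in both templates carries no information
        p_signal *= s / (s + b)
        p_background *= b / (s + b)
    total = p_signal + p_background
    if total == 0.0:
        return 0.5  # contradictory inputs: return the neutral value
    return p_signal / total
```

Because each variable enters only through its one-dimensional projection, correlations between the inputs are ignored; this is the main simplification of the approach.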



The matrix element used here is computed using four-vectors from the event after kinematically constraining $M_{\ell\nu} = M_W$ and $M_{\ell\nu b} = m_t$, where $M_W$ and $m_t$ are the $W$ and top quark pole masses in the matrix element. The $M_W$ constraint introduces a quadratic ambiguity in the $z$ component of the neutrino momentum; we choose the solution with the smaller $|p_z^\nu|$.
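
The quadratic ambiguity can be made explicit with a short sketch. It treats the lepton and neutrino as massless and identifies the neutrino transverse momentum with the missing transverse energy. The default value of `m_w` and the handling of a negative discriminant (returning the real part) are assumptions made here; neither is specified in the text.

```python
import math

M_W = 80.4  # GeV/c^2; assumed W pole mass, not quoted in the text


def neutrino_pz_solutions(lep_px, lep_py, lep_pz, met_x, met_y, m_w=M_W):
    """Return both p_z solutions of the constraint M(lepton, neutrino) = m_w.

    If the quadratic has no real root, the real part is returned twice
    (an assumed convention).
    """
    pt_l2 = lep_px**2 + lep_py**2
    if pt_l2 == 0.0:
        raise ValueError("lepton must have non-zero transverse momentum")
    e_l2 = pt_l2 + lep_pz**2          # squared lepton energy (massless lepton)
    pt_nu2 = met_x**2 + met_y**2      # squared neutrino transverse momentum
    mu = 0.5 * m_w**2 + lep_px * met_x + lep_py * met_y
    center = mu * lep_pz / pt_l2
    discriminant = center**2 - (e_l2 * pt_nu2 - mu**2) / pt_l2
    if discriminant < 0.0:
        return (center, center)
    root = math.sqrt(discriminant)
    return (center - root, center + root)


def choose_neutrino_pz(solutions):
    """Pick the solution with the smaller absolute value, as in the text."""
    return min(solutions, key=abs)
```

A typical call would be `choose_neutrino_pz(neutrino_pz_solutions(px, py, pz, met_x, met_y))`, with all momenta in GeV/$c$.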

For events with three jets, ten input variables are used to construct $\mathcal{L}_{3j}$: $b_{NN}$, $Q \times \eta$, $M_{\ell\nu b}$, $\cos\theta_{\ell j}$, $M_{jj}$ of the two jets not selected as the $b$ from top decay, the number of $b$ tags, the smallest $\Delta R$ between any two jets, the smallest $p_T$ of the three jets, $p_T(W)$, and $E_T$ of the jet selected as coming from the $b$ from the top quark decay. The $b$-quark jet is chosen using a linear combination of the jet $E_T$ and the $\chi^2$ resulting from the comparison of the kinematically constrained jet energy and the measured jet energy.

Matrix Element Discriminant (ME): The matrix element method relies on the evaluation of event probability densities for signal and background processes based on calculations of the standard model differential cross sections [24]. We construct these probability densities for each process for each event given their measured quantities $x$ by integrating the appropriate differential cross section $d\sigma(y)/dy$ over the underlying partonic quantities $y$, convolved with the parton distribution functions (PDFs) and detector resolution effects:

$$P(x) = \sum_{\mathrm{perm.}} \int \frac{d\sigma(y)}{dy}\, f(q_1)\, f(q_2)\, dq_1\, dq_2\, W(x,y)\, dy. \qquad (1)$$

The PDFs ($f(q_1)$ and $f(q_2)$) take into account the flavors of the colliding quark and anti-quark. We use the CTEQ PDF parameterization [25]. The detector resolution effects are described by a transfer function $W(x,y)$ relating $x$ to $y$. The momenta of electrons and muons and the angles of jets are assumed to be measured exactly. $W(x,y)$ maps parton energies to measured jet energies after correction for instrumental effects [9]. This mapping is obtained by parameterizing the jet response in fully simulated MC events. The definition of the probability densities includes possible permutations of matching jets with partons. The integration is performed over the energy of the partons and $p_z^\nu$. We calculate the matrix element for the event probability at tree level using MADGRAPH [26]. Event probability densities are computed for the $s$-channel and $t$-channel signal as well as $Wb\bar{b}$, $Wc\bar{c}$, $Wcj$, $Wjj$, and $t\bar{t}$ background hypotheses. In the specific case of the $t\bar{t}$ matrix element, additional integrations are performed over the momenta of particles not detected.

The event probability densities are combined into an event probability discriminant: $\mathrm{EPD} = P_{\mathrm{signal}}/(P_{\mathrm{signal}} + P_{\mathrm{background}})$. To better classify signal events that contain $b$ jets, we incorporate the output $b_{NN}$ of the neural network jet-flavor separator into the final discriminant:

$$\mathrm{EPD} = \frac{b_{NN} \cdot P_{st}}{b_{NN}\,(P_{st} + P_{t\bar{t}} + P_{Wb\bar{b}}) + (1 - b_{NN})\,(P_{Wc\bar{c}} + P_{Wcj} + P_{Wjj})} \qquad (2)$$

Both signal channels are combined to one single top probability density $P_{st} = P_{s\text{-channel}} + P_{t\text{-channel}}$.
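
Equation (2) translates directly into a small function, sketched here under the assumption that $b_{NN}$ has been scaled to the interval [0, 1] so that it can act as a weight. The function and argument names are illustrative, and the zero-denominator guard is an addition made here.

```python
def event_probability_discriminant(
    b_nn: float,
    p_st: float,
    p_ttbar: float,
    p_wbb: float,
    p_wcc: float,
    p_wcj: float,
    p_wjj: float,
) -> float:
    """Evaluate Eq. (2) for one event.

    b_nn weights the hypotheses containing real b jets (single top, ttbar,
    Wbb); (1 - b_nn) weights those without (Wcc, Wcj, Wjj).
    p_st is the sum of the s-channel and t-channel probability densities.
    """
    numerator = b_nn * p_st
    denominator = (
        b_nn * (p_st + p_ttbar + p_wbb)
        + (1.0 - b_nn) * (p_wcc + p_wcj + p_wjj)
    )
    if denominator == 0.0:
        return 0.0  # no hypothesis supports the event; treat as background-like
    return numerator / denominator
```

For a jet that looks exactly like a $b$ jet ($b_{NN} = 1$) the charm and light-flavor hypotheses drop out of the denominator, while for $b_{NN} = 0$ the discriminant vanishes.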

Neural Network Discriminant (NN): The third multivariate approach [11] employs neural networks, which have the general advantage that correlations between the discriminating input variables are identified and utilized to optimize the separation power between signal and background. The networks are developed using the NEUROBAYES analysis package [20], which combines a three-layer feed-forward neural network with a complex and robust preprocessing of the input variables. Bayesian regularization techniques are utilized to avoid over-training.

Four separate networks are trained to identify different signals in distinct samples using simulated events from the common samples described previously. An $s$-channel signal is used for training on events with two $b$-tagged jets. A $t$-channel signal is used for the two-jet sample with a single $b$ tag and for the three-jet samples with one or two $b$ tags. The networks use 11 to 18 input variables. The most important ones are $M_{\ell\nu b}$, $b_{NN}$, $M_{jj}$, $Q \times \eta$, $\cos\theta_{\ell j}$, the transverse mass of the $W$ boson, and $H_T$. The input variables are selected from a large list using an automated evaluation during the preprocessing step before the network training. In an iterative process, we determine those variables whose removal would cause a significant loss in separation power between signal and background and use them for network training.

Combination: We studied two methods to combine the cross section fit results. The best linear unbiased estimator (BLUE) [27] technique optimizes the coefficients of a linear combination using the uncertainties and correlations of the three individual analyses: LF, ME, and NN. The correlation coefficients between the analyses are: LF-ME: 59%; LF-NN: 74%; ME-NN: 61%. In another combination approach, a "super analysis" is built based on the outcomes for each event in each of the three individual analyses. The super-discriminant method uses a neuro-evolution network [28] trained to separate the signal from the background based on the discriminant outputs of the three analyses. With the super-discriminant analysis we improve the sensitivity by 10% over the best individual analysis, and we use it to quote our final results. As a cross-check, BLUE yields a 7% sensitivity improvement.
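
The BLUE weights follow from the covariance matrix $C$ of the inputs as $w = C^{-1}\mathbf{1} / (\mathbf{1}^T C^{-1} \mathbf{1})$. The sketch below implements this with a small pure-Python linear solver. The correlation coefficients are the ones quoted above; the input cross sections and symmetric uncertainties are hypothetical placeholders, not the results of the individual analyses.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def blue_combination(values, sigmas, correlations):
    """Return (combined value, combined uncertainty, weights)."""
    n = len(values)
    covariance = [
        [correlations[i][j] * sigmas[i] * sigmas[j] for j in range(n)]
        for i in range(n)
    ]
    c_inv_ones = solve_linear_system(covariance, [1.0] * n)
    norm = sum(c_inv_ones)
    weights = [u / norm for u in c_inv_ones]
    combined = sum(w * v for w, v in zip(weights, values))
    return combined, (1.0 / norm) ** 0.5, weights


# Correlations quoted in the text, in the order LF, ME, NN.
CORRELATIONS = [
    [1.00, 0.59, 0.74],
    [0.59, 1.00, 0.61],
    [0.74, 0.61, 1.00],
]
# Hypothetical placeholder inputs in pb, for illustration only.
example_values = [2.0, 2.4, 2.1]
example_sigmas = [0.9, 0.8, 0.9]
combined, sigma, weights = blue_combination(
    example_values, example_sigmas, CORRELATIONS
)
print(f"combined = {combined:.2f} +/- {sigma:.2f} pb, weights = {weights}")
```

Because the analyses are strongly correlated, the combined uncertainty improves only modestly over the best single input, which is consistent with the size of the sensitivity gain reported for BLUE.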

Before unblinding the data, the MC simulation of each input variable and the discriminant outputs were checked in data control samples depleted in signal. These are the lepton + $b$-tagged four-jet sample, which is enriched in $t\bar{t}$ events, and the two- and three-jet samples in which no jet is $b$ tagged. The latter are high-statistics samples with similar kinematics to the $b$-tagged signal samples. The data distributions in the control samples are described well by the MC simulation.

FIG. 2: Discriminant distribution for all channels combined of the (a) LF analysis, (b) ME analysis, (c) NN analysis and (d) combined super-discriminant analysis. Points with error bars indicate the data. The predicted signal and background distributions are shown as stacked histograms. The insets show the candidate events in the signal regions. A summary of all results is shown in (e).

Figure 2 shows the distributions of the individual analyses' discriminants and the super-discriminant. We calculate the probability ($p$-value) [29] of the background-only discriminant distribution to fluctuate to the observed data or more, which is then converted into a signal significance under a Gaussian assumption. All sources of systematic uncertainty are included and correlations between normalization and discriminant shape changes are considered. Uncertainties in the jet energy scale, $b$-tagging efficiencies, lepton identification and trigger efficiencies, the amount of initial and final state radiation, PDFs, factorization and renormalization scale, and MC modeling have been explored and incorporated in this combination and all individual analyses.
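
The conversion between a $p$-value and a Gaussian significance can be sketched with the Python standard library. The one-sided convention used here is an assumption; the text states only that a Gaussian assumption is made.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def significance_from_p_value(p_value: float) -> float:
    """Return Z such that a standard normal exceeds Z with probability p_value."""
    if not 0.0 < p_value < 1.0:
        raise ValueError("p_value must lie strictly between 0 and 1")
    # By symmetry, the upper-tail quantile is minus the lower-tail quantile.
    return -_STANDARD_NORMAL.inv_cdf(p_value)


def p_value_from_significance(z: float) -> float:
    """Return the one-sided tail probability above z standard deviations."""
    return _STANDARD_NORMAL.cdf(-z)


for z in (3.7, 4.9):
    print(f"{z} standard deviations <-> p = {p_value_from_significance(z):.2e}")
```

Under this convention, 3.7 standard deviations corresponds to a background-only fluctuation probability of roughly one in ten thousand.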

We interpret the excess of signal-like events over the expected background as strong evidence for single top production with a signal significance of 3.7 standard deviations, with a sensitivity, defined to be the median expected significance, of 4.9 standard deviations. The most probable value of the combined $s$-channel and $t$-channel cross sections is $\sigma_{st} = 2.2^{+0.7}_{-0.6}$ pb for a top quark mass of 175 GeV/$c^2$, which is consistent with the cross-check result obtained from BLUE, $\sigma_{st} = 2.3 \pm 0.7$ pb. The dependence on the top quark mass is +0.02 pb/(GeV/$c^2$). From the cross section measurement at $m_t = 175$ GeV/$c^2$, we obtain $|V_{tb}| = 0.88^{+0.13}_{-0.12}$(stat. + sys.) ± 0.07(theory [3]) and limit $|V_{tb}| > 0.66$ at the 95% C.L. assuming a flat prior in $|V_{tb}|^2$ from 0 to 1. This is the most precise direct measurement of $|V_{tb}|$ to date.
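
The central value of $|V_{tb}|$ can be cross-checked with a naive scaling, assuming the single top cross section is proportional to $|V_{tb}|^2$ and that the NLO prediction of 2.86 pb quoted earlier corresponds to $|V_{tb}| = 1$. This sketch reproduces only the central value; the quoted uncertainties and the 95% C.L. limit come from the flat-prior treatment described above and are not derived here.

```python
import math

SIGMA_SM_PB = 2.86  # NLO prediction quoted in the text, assumed to hold for |V_tb| = 1


def vtb_from_cross_section(sigma_measured_pb: float,
                           sigma_sm_pb: float = SIGMA_SM_PB) -> float:
    """Naive estimate |V_tb| = sqrt(sigma_measured / sigma_SM)."""
    if sigma_measured_pb < 0.0 or sigma_sm_pb <= 0.0:
        raise ValueError("cross sections must be positive")
    return math.sqrt(sigma_measured_pb / sigma_sm_pb)


print(f"|V_tb| ~ {vtb_from_cross_section(2.2):.2f}")
```

With the measured 2.2 pb this gives about 0.88, matching the central value quoted above.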